AITopics | preliminary evaluation

preliminary evaluation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

US regulators launch investigation into self-driving Teslas after series of crashes

The Guardian | Oct-9-2025, 11:45:28 GMT

US automobile safety regulators have opened an investigation into Tesla vehicles equipped with its full self-driving technology over traffic-safety violations after a series of crashes. The National Highway Traffic Safety Administration (NHTSA) said the electric carmaker's self-driving assistance system, which requires drivers to pay attention and intervene if needed, had "induced vehicle behaviour that violated traffic safety laws". The preliminary evaluation by the NHTSA is the first step before potentially seeking a recall of the vehicles if it believes they pose a risk to safety.

investigation, tesla, vehicle, (11 more...)

The Guardian

Country:

Europe > Ukraine (0.07)
North America > United States > California (0.06)
Oceania > Australia (0.05)

Industry:

Transportation > Ground > Road (1.00)
Government > Regional Government > North America Government > United States Government (0.53)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)

US investigates 2.4m Tesla self-driving vehicles after reported collisions

The Guardian | Oct-18-2024, 13:36:58 GMT

The US government's road safety agency has opened an investigation into 2.4m Tesla vehicles with the automaker's Full Self-Driving software after four reported collisions, including a fatal crash. The National Highway Traffic Safety Administration (NHTSA) on Friday said it was opening the preliminary evaluation after four reports of crashes where Full Self-Driving was engaged during reduced roadway visibility like sun glare, fog or airborne dust. In one crash "the Tesla vehicle fatally struck a pedestrian. One additional crash in these conditions involved a reported injury," NHTSA said. The investigation covers 2016-2024 Model S and X vehicles with the optional system as well as 2017-2024 Model 3, 2020-2024 Model Y, and 2023-2024 Cybertruck vehicles.

roadway visibility condition, tesla, vehicle, (8 more...)

The Guardian

Country: North America > United States (0.73)

Industry:

Transportation > Ground > Road (1.00)
Government > Regional Government > North America Government > United States Government (0.57)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)

Scheherazade: Evaluating Chain-of-Thought Math Reasoning in LLMs with Chain-of-Problems

Miner, Stephen, Takashima, Yoshiki, Han, Simeng, Erata, Ferhat, Antonopoulos, Timos, Piskac, Ruzica, Shapiro, Scott J

arXiv.org Artificial Intelligence | Oct-10-2024

Benchmarks are critical for measuring progress of math reasoning abilities of Large Language Models (LLMs). However, existing widely-used benchmarks such as GSM8K have been rendered less useful as multiple cutting-edge LLMs achieve over 94% accuracy. While harder benchmarks have been proposed, their creation is often manual and expensive. We present Scheherazade, an automated approach for producing challenging mathematical reasoning benchmarks by logically chaining mathematical reasoning problems. We propose two different chaining methods, forward chaining and backward chaining, which require reasoning forward and backward through the chain respectively. We apply Scheherazade on GSM8K to create GSM8K-Scheherazade and evaluate 3 frontier LLMs and OpenAI's o1-preview on it. We show that while frontier models' performance declines precipitously at only a few questions chained, a preliminary evaluation suggests o1-preview performance persists up to 5 questions chained backwards. In addition, while all other models perform worse when problems are chained backwards, o1-preview performs better on backward-chained benchmarks. We will release the dataset and code publicly.

large language model, machine learning, scheherazade, (18 more...)

arXiv.org Artificial Intelligence

2410.00151

Country: North America > United States (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)

GPT as Psychologist? Preliminary Evaluations for GPT-4V on Visual Affective Computing

Lu, Hao, Niu, Xuesong, Wang, Jiyao, Wang, Yin, Hu, Qingyong, Tang, Jiaqi, Zhang, Yuting, Yuan, Kaishen, Huang, Bin, Yu, Zitong, He, Dengbo, Deng, Shuiguang, Chen, Hao, Chen, Yingcong, Shan, Shiguang

arXiv.org Artificial Intelligence | Apr-10-2024

Multimodal large language models (MLLMs) are designed to process and integrate information from multiple sources, such as text, speech, images, and videos. Despite their success in language understanding, it is critical to evaluate their performance on downstream tasks for better human-centric applications. This paper assesses the application of MLLMs with 5 crucial abilities for affective computing, spanning visual affective tasks and reasoning tasks. The results show that GPT-4V has high accuracy in facial action unit recognition and micro-expression detection, while its general facial expression recognition performance is not accurate. We also highlight the challenges of achieving fine-grained micro-expression recognition and the potential for further study, and demonstrate the versatility and potential of GPT-4V for handling advanced tasks in emotion recognition and related fields by integrating with task-related agents for more complex tasks, such as heart rate estimation through signal processing. In conclusion, this paper provides valuable insights into the potential applications and challenges of MLLMs in human-centric computing. Our interesting examples are at https://github.com/EnVision-Research/GPT4Affectivity.

preliminary evaluation, psychologist, visual affective computing, (1 more...)

arXiv.org Artificial Intelligence

2403.05916

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (0.89)
Information Technology > Artificial Intelligence > Natural Language (0.87)

THUIR2 at NTCIR-16 Session Search (SS) Task

Su, Weihang, Li, Xiangsheng, Liu, Yiqun, Zhang, Min, Ma, Shaoping

arXiv.org Artificial Intelligence | Jul-1-2023

Our team (THUIR2) participated in both the FOSS and POSS subtasks of the NTCIR-16 Session Search (SS) Task. This paper describes our approaches and results. In the FOSS subtask, we submitted five runs using learning-to-rank and fine-tuned pre-trained language models. We fine-tuned the pre-trained language model with ad-hoc data and session information and assembled them by a learning-to-rank method. The assembled model achieves the best performance among all participants in the preliminary evaluation. In the POSS subtask, we used an assembled model which also achieves the best performance in the preliminary evaluation.

artificial intelligence, language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2307.0025

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.17)
Asia > China > Beijing > Beijing (0.06)

Genre: Research Report (0.51)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

A Preliminary Evaluation of ChatGPT for Zero-shot Dialogue Understanding

Pan, Wenbo, Chen, Qiguang, Xu, Xiao, Che, Wanxiang, Qin, Libo

arXiv.org Artificial Intelligence | Apr-9-2023

Zero-shot dialogue understanding aims to enable dialogue to track the user's needs without any training data, which has gained increasing attention. In this work, we investigate the understanding ability of ChatGPT for zero-shot dialogue understanding tasks including spoken language understanding (SLU) and dialogue state tracking (DST). Experimental results on four popular benchmarks reveal the great potential of ChatGPT for zero-shot dialogue understanding. In addition, extensive analysis shows that ChatGPT benefits from the multi-turn interactive prompt in the DST task but struggles to perform slot filling for SLU. Finally, we summarize several unexpected behaviors of ChatGPT in dialogue understanding tasks, hoping to provide some insights for future research on building zero-shot dialogue understanding systems with Large Language Models (LLMs).

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2304.04256

Country:

North America > United States > Pennsylvania (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Robofriend: An Adaptive Storytelling Robotic Teddy Bear -- Technical Report

Glanz, Ido, Weksler, Matan, Karpas, Erez, Horowitz-Kraus, Tzipi

arXiv.org Artificial Intelligence | Jan-4-2023

Language exposure at an early stage of development is critical for the facilitation of brain networks associated with language (Kuhl [2004], Cardillo and Kuhl [2009], Moon et al. [2013]). Storytelling is one form of language exposure, which was found to be associated with greater engagement not only in language processing but also in visualization and cognitive abilities in children (Hutton et al. [2015]). Interestingly, it was suggested that it is not the storytelling itself that is related to these improvements, but the interaction during the stories that amplifies these abilities in children (Twait et al. [2019]). A recent study demonstrated that a group of 4-6-year-old children attending storytelling sessions interactively showed greater cognitive and language abilities than a group attending non-interactively (storytelling sessions on the screen) (Twait et al. [2019]). Hence, a question was raised regarding this positive effect during interactive (dialogic) storytelling: is the positive effect due to the human interaction?

artificial intelligence, machine learning, robofriend, (18 more...)

arXiv.org Artificial Intelligence

2301.01576

Country: Asia > Middle East > Israel (0.05)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.54)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

First Known Tesla Autopilot Death Spurs Federal Investigation

Popular Science | Jul-1-2016, 00:42:21 GMT

We learned yesterday evening that NHTSA is opening a preliminary evaluation into the performance of Autopilot during a recent fatal crash that occurred in a Model S. This is the first known fatality in just over 130 million miles where Autopilot was activated. Among all vehicles in the US, there is a fatality every 94 million miles. Worldwide, there is a fatality approximately every 60 million miles. It is important to emphasize that the NHTSA action is simply a preliminary evaluation to determine whether the system worked according to expectations. Following our standard practice, Tesla informed NHTSA about the incident immediately after it occurred.

artificial intelligence, autopilot death spur federal investigation, preliminary evaluation, (4 more...)

Popular Science

Country: North America > United States (0.30)

Industry: Transportation > Ground > Road (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.40)

Tesla's 'Autopilot' feature probed after fatal crash

USATODAY - Tech Top Stories | Jul-1-2016, 00:41:45 GMT

A preliminary investigation has begun for a fatal car crash involving a Tesla Model S. According to the National Highway Traffic Safety Administration, the electric model sedan had Autopilot mode engaged when a driver was killed. The National Highway Traffic Safety Administration has opened a preliminary evaluation into the fatal crash of a Tesla electric car that had its "Autopilot" feature engaged at the time of the incident. NHTSA says in its filing that the crash was reported by Tesla and that its probe, part of a process that can eventually lead to a recall, centers on the car's self-driving feature. "This preliminary evaluation is being opened to examine the design and performance of any automated driving systems in use at the time of the crash," the safety agency said in a filing. The crash occurred when a tractor-trailer made a left turn in front of a 2015 Tesla on a highway near Williston, Fla., NHTSA said. The driver died due to injuries sustained in the accident.

artificial intelligence, fatal crash, tesla, (11 more...)

USATODAY - Tech Top Stories

Country: North America > United States > California > Alameda County > Fremont (0.06)

Industry: Transportation > Ground > Road (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)

Fatal crash of Tesla Model S in autopilot prompts 'preliminary evaluation' by federal officials

Los Angeles Times | Jun-30-2016, 21:32:15 GMT

The National Highway Traffic Safety Administration (NHTSA) is opening a preliminary evaluation into Tesla's autopilot feature, after the fatal crash of a Model S that was in self-driving mode, the electric automaker said Thursday. According to a blog post from Tesla Motors Inc., the car was on an unnamed, divided highway when a tractor trailer drove across the road perpendicular to the Model S. "Neither Autopilot nor the driver noticed the white side of the tractor trailer against a brightly lit sky, so the brake was not applied," Tesla said in the post. The Model S passed under the trailer, with the bottom of the trailer impacting the windshield of the Model S, Tesla said. Tesla said this was the first fatality in which the autopilot feature was activated, with more than 130 million miles driven using that feature. The Palo Alto automaker said it informed NHTSA about the incident "immediately after it occurred."

artificial intelligence, autopilot prompt, preliminary evaluation, (7 more...)

Los Angeles Times

Country: North America > United States > California > Santa Clara County > Palo Alto (0.29)

Industry: Transportation > Ground > Road (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.90)